Understanding key structural properties of large-scale networks is crucial for analyzing and
optimizing their performance and for improving their reliability and security. Here we show that these
networks possess a previously unnoticed feature, global curvature, which we argue has a major
impact on core congestion: the load at the core of a network with $N$ nodes scales as $N^2$, as compared
to $N^{1.5}$ for a flat network. We substantiate this claim through analysis of a collection of real data
networks across the globe as measured and documented by previous researchers.

PACS numbers: 



Large-scale data networks form the infrastructure for
contemporary global communications. Increasingly, the
trend in these networks is towards converged services over
the Internet protocol, dynamic and automatic reconfigurability,
and flatter architectures for fast service creation
and survivability. In such a large and fast-changing
environment, there is a need to identify key structural
properties that affect performance, reliability and
security, and that support efficient and scalable models
for estimating these metrics reliably.

Recent models of networks have focused on features
such as their 'small world' property [1], [?], [?] or power-law
degree distributions [?], [?], [?]. There has been evidence
for power-law degree distributions in data networks
at the IP layer [?], for the worldwide web [?], and even
for the virtual network of social connections [8], but such
distributions are found not to exist for physical networks
such as electrical grids [1, ?] and some biological networks [?].
Although these features are interesting and important, the
impact of intrinsic geometrical and topological features
of large-scale networks on performance, reliability and
security is of much greater importance. Intuitively, it is
known that traffic between nodes tends to go through a
relatively small core of the network [?], as if the shortest
path between them were curved inwards. It has been suggested
that this property may be due to global curvature
of the network [12].

In this paper, we define the global (negative) curvature
for finite networks and demonstrate its existence at the
IP layer by examining topologies of numerous publicly
available networks [13]. A recent report [?] also refers
to curvature as a possible cause of some key observations
about networks at the IP layer. However, these authors
assume negative curvature and construct a model with a
few extra simple assumptions that shares various features
with real networks, such as a power-law degree distribution.
By contrast, we demonstrate negative curvature
through direct measurement.

Turning to the impact of negative curvature, we focus
on the load (also referred to as the betweenness centrality),
defined by assuming unit traffic between each pair
of nodes in the network with shortest-path routing, and
calculating the traffic through each node. (This is not
the actual time-variable demand that is routed through
nodes and links at the IP layer.) We show that network
curvature, or $\delta$-hyperbolicity [15], implies that the
load at the core of the network scales with the number of
nodes $N$ as $\sim N^2$, which is faster than the $\sim N^{1.5}$ scaling
for flat networks. Thus core congestion is worse in
hyperbolic networks, and geodesic routing achieved with
greedy algorithms on hyperbolic networks [3] is actually
problematic. Previous work [?], [?], [?] has considered
the load as a function of node degree for fixed $N$, which
we have also examined separately [?].
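
The load defined above can be computed directly for small graphs. The following is a minimal Python sketch, not the authors' code. It assumes an undirected graph stored as a dictionary of neighbor lists (the name `node_load` and this representation are illustrative choices), counts each unordered pair once, does not count a pair's traffic at its own endpoints, and splits traffic evenly over tied geodesics using the standard breadth-first accumulation for betweenness centrality.

```python
from collections import deque


def node_load(adj):
    """Load per node for unit demand between every unordered node pair,
    routed on shortest paths and split evenly over tied geodesics.

    Assumption: a pair's traffic is not counted at its own endpoints.
    """
    load = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate the share of traffic from s, farthest nodes first.
        dep = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                dep[v] += sigma[v] / sigma[w] * (1.0 + dep[w])
            if w != s:
                load[w] += dep[w]
    # Every unordered pair was visited once from each endpoint.
    return {v: x / 2.0 for v, x in load.items()}
```

On a five-node path the middle node carries 4 units of load (the pairs whose endpoints lie on opposite sides of it). On a four-node cycle each node carries 0.5, because each pair of opposite nodes splits its unit of traffic over two tied geodesics.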




FIG. 1: A rendering of the graph for the network 
7018(AT&T). 



Negative curvature of a geodesic metric space is defined
by Gromov [?] in terms of the '$\delta$-Thin Triangle
Condition'. For a graph, an appropriate metric can be
used. For any three nodes $(ijk)$, the geodesics $g_{ij}$, $g_{jk}$
and $g_{ki}$ of lengths $d_{ij}$, $d_{jk}$ and $d_{ki}$ are constructed. A
fourth node $m$ is chosen, and the shortest distance from
$m$ to any node on $g_{ij}$ is defined as $d(m; ij)$.
The distance $D(m; ijk)$ is defined as the maximum of
$d(m; ij)$, $d(m; jk)$ and $d(m; ki)$. Then if

$$\max_{(ijk)} \min_{m} D(m; ijk) = \delta \tag{1}$$

is finite, the (infinite) graph is said to have negative
or hyperbolic curvature. Other definitions of curvature
count the triangles (or other polygons) that meet at each
vertex [20], but these define a local, not global, curvature
and can be argued to be unrelated to the global performance
of networks.

For a finite graph, Eq. (1) is trivially finite and the
Gromov curvature has to be modified. We introduce the
concept of the "curvature plot" of a network: for every
triangle $\Delta = (ijk)$ we plot $\delta_\Delta$ vs $L_\Delta$, where

$$\delta_\Delta = \min_{m} D(m; ijk), \qquad L_\Delta = \min[d_{ij}, d_{jk}, d_{ki}]. \tag{2}$$

This yields $P_L(\delta)$, the probability distribution for $\delta$ at
fixed $L$. If the peak of $P_L(\delta)$ is at $\delta = \delta_p(L)$, the network
is flat (negatively curved) if $\delta_p(L)$ increases linearly
(sublinearly) with $L$ [21]. Since we use the peak of the
distribution instead of the maximum as in Eq. (1), statistical
sampling of triangles is sufficient.
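
The curvature plot can be sketched in a few lines of Python for small graphs. This is an illustrative sketch under stated assumptions, not the authors' procedure: the graph is assumed connected and stored as a dictionary of neighbor lists, the hop metric is used, one geodesic per side is picked by breadth-first order when several are tied (the text does not say how ties are resolved), and the names `triangle_delta`, `curvature_plot` and `ridge` are hypothetical.

```python
import random
from collections import Counter, deque


def bfs(adj, sources):
    """Hop distance from the nearest source, plus BFS parents."""
    dist = {s: 0 for s in sources}
    parent = {s: None for s in sources}
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                parent[w] = v
                queue.append(w)
    return dist, parent


def geodesic(adj, a, b):
    """Nodes of one shortest path between a and b (ties broken by BFS order)."""
    _, parent = bfs(adj, [a])
    path = [b]
    while path[-1] != a:
        path.append(parent[path[-1]])
    return path


def triangle_delta(adj, i, j, k):
    """Return (L_Delta, delta_Delta) of Eq. (2) for the triangle (ijk)."""
    sides = [geodesic(adj, i, j), geodesic(adj, j, k), geodesic(adj, k, i)]
    # A multi-source BFS from each side gives d(m; side) for every node m.
    to_side = [bfs(adj, side)[0] for side in sides]
    delta = min(max(d[m] for d in to_side) for m in adj)
    shortest = min(len(side) - 1 for side in sides)
    return shortest, delta


def curvature_plot(adj, samples, seed=0):
    """Histogram of (L_Delta, delta_Delta) over randomly sampled triangles."""
    rng = random.Random(seed)
    nodes = list(adj)
    counts = Counter()
    for _ in range(samples):
        i, j, k = rng.sample(nodes, 3)
        counts[triangle_delta(adj, i, j, k)] += 1
    return counts


def ridge(counts):
    """delta_p(L): the most frequent delta at each L."""
    best = {}
    for (L, d), c in counts.items():
        if L not in best or c > best[L][1]:
            best[L] = (d, c)
    return {L: d for L, (d, c) in sorted(best.items())}
```

For a tree the ridge sits at $\delta_p(L) = 0$ for every $L$, since each geodesic triangle is a tripod whose branch point lies on all three sides. On a six-node cycle the triangle (0, 2, 4) gives $L_\Delta = 2$ and $\delta_\Delta = 1$. Each triangle costs six breadth-first searches, so sampling rather than enumerating triangles keeps the cost manageable on larger graphs.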

Figure 2 shows the curvature plot for network
7018 (AT&T) from the Rocketfuel database [?] (see Figure 1).
The metric used is the 'hop metric', where each
edge of the graph has unit length. This is a common metric
that best illustrates the geometrical properties of the
graph, including the 'small world' property [22]. The networks
in this database are at the IP layer and describe
the IP port to IP port connectivity of the network. A
sharp ridge is seen along the curve $\delta_p(L)$. The ridge is
a straight line through the origin for the triangular lattice
but bends over parallel to the $L$-axis for the 7018
network ($P_L(\delta)$ is zero for $\delta > 3$ for all $L$, though the
diameter of the network is 12). For all the networks in
the database, we have verified that the measured $\delta$'s do
not exceed 3, even though the network diameters range
from 12 to 14 (with the exception of 4755/VSNL, whose
diameter is 6, but whose ratio diameter/$\delta$ is even bigger,
6). The ratio of 3/12, or 25%, is comfortably within the
theoretical bound for scaled hyperbolic graphs [24].

As another manifestation of the curvature, Figure 3
shows the average $\delta$ for each $L$, $E[\delta](L)$, for all ten networks
in the Rocketfuel database. The plots saturate
for relatively small $L$. The figure also shows $E[\delta](L)$ for
the Barabasi-Albert model [?] and a Watts-Strogatz type
model. Although both of these models exhibit small
world behavior, we see that only the first has negative
curvature as defined in Eq. (1). The plot for the Watts-Strogatz
graph shows signs of saturation for large $L$, but
the size of this graph was chosen so that it was already
well in the small world regime [?].

Turning to the performance implications of hyperbolic
curvature, the simplest graphs with (constant) negative
curvature are the hyperbolic grids $X_{p,q}$, consisting of $q$
regular $p$-gons at each vertex, with $(p-2)(q-2) > 4$ [25].

FIG. 2: (a) Probability $P_L(\delta)$ for randomly chosen triangles
whose shortest side is $L$ to have a given $\delta$ as defined in Eq. (2)
for the network 7018 (AT&T), which has 10152 nodes
and 14319 links and diameter 12. The quantities $\delta$ and $L$ are
restricted to integers, and the smooth plot is by interpolation.
(b) Similar to (a), for a (flat) triangular lattice with 469 nodes
and 1260 links. (The smaller number of nodes is sufficient for
comparing with (a) since the range for $L$ is large due to the
absence of the small world effect.)

FIG. 3: The average $\delta$ as a function of $L$, $E[\delta](L)$, for the 10
IP-layer networks studied here, and for the Barabasi-Albert
model with $k = 2$ and $N = 10000$ (11th curve) and the hyperbolic
grid $X_{3,7}$ (12th curve). For comparison, a Watts-Strogatz
type model on a square lattice with $N = 6400$, open
boundary conditions and 5% extra random connections (13th
curve) and two flat grids (the triangular lattice with diameter
29 and the square lattice with diameter 154) are also shown.



FIG. 4: Plot of the maximum load $L_c(N)$ for each network in
the Rocketfuel database as a function of the number of nodes
$N$ in the network. Also shown are the maximum loads for the
hyperbolic grid $X_{3,7}$, the Barabasi-Albert model with $k = 2$,
the Watts-Strogatz model and a triangular lattice, for various
$N$. The dashed lines have slopes of 2.0 and 1.5, corresponding
to the hyperbolic and Euclidean cases respectively.



(When $(p-2)(q-2) = 4$, the graph is flat.) We construct
finite hyperbolic grids by truncating to $n$ hops from the
center. The number of nodes $N$ in the graph increases exponentially
as $n$ is increased. With unit demand between
all node pairs and the traffic between two nodes traveling
along a geodesic connecting them (evenly distributed
over all geodesics in case of ties), we have verified numerically
that the load at the center of the graph scales with
the number of nodes $N$ in the graph as

$$L_c(N) \sim N^2. \tag{3}$$



The same result can be obtained analytically for the continuum
Poincare disk truncated to a radius $r < 1$, converted
to a graph by introducing a uniform distribution
of nodes with each node connected to its neighbors [27].
By contrast, it is not hard to verify that $L_c(N) \sim N^{1.5}$
for a Euclidean graph. Physically, this is because the
traffic from the $\sim N$ nodes on the left of a Euclidean
lattice to the $\sim N$ nodes on the right flows through the
center across a line of length $\sim \sqrt{N}$, whereas for a hyperbolic
graph it is pulled inwards and flows within an $O(1)$
distance from the center. Figure 4 shows the load at the
node with the highest load versus $N$ for all the networks
in the Rocketfuel database, demonstrating $\sim N^2$ scaling.
The figure also shows results for the Barabasi-Albert and
Watts-Strogatz models; we see that the first shows $\sim N^2$
scaling but the second does not, confirming our earlier
conclusion that the latter is a poorer fit to Internet-type
large-scale networks.
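
The counting argument above can be written out as an order-of-magnitude estimate. This is only a heuristic restatement of the reasoning in the text, with constant factors dropped; the superscript labels are introduced here purely for the comparison. The $\sim N \cdot N$ left-right pairs are shared among the $\sim \sqrt{N}$ nodes of the central line in the Euclidean case, and among $O(1)$ nodes near the center in the hyperbolic case:

$$L_c^{\rm Euclidean}(N) \sim \frac{N \cdot N}{\sqrt{N}} = N^{1.5}, \qquad L_c^{\rm hyperbolic}(N) \sim \frac{N \cdot N}{O(1)} \sim N^{2}.$$

The ratio of the two grows as $\sqrt{N}$. On this estimate, a network of $N = 10^4$ nodes would see a core load larger by a factor of order 100 than a flat network of the same size, up to the dropped constants.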

FIG. 5: Taxonomy of key characteristics of networks
and their overlaps in a schematic diagram. "Hairy", as
used in this figure, refers to the simple mechanism of making
a grid power-law by adding to each node a set of singly
connected nodes (hairs) whose number is drawn from any
desired power-law distribution. PLDD refers to power-law
degree distributions.

There are two points worth noting. First, one might
wonder whether the concentration of geodesics and load
near the center is trivial because the networks we have
studied are almost simple trees. However, the ratio of
the number of edges to nodes in these networks ranges
from 1.27 to 2.72, showing that they are far from being
trees. Second, as the example of hyperbolic grids
demonstrates, one can construct graphs where every node
has the same degree, but which exhibit the 'small world'
property and show $N^2$ scaling of load. Thus, although
the networks we have studied do seem to have power-law
degree distributions, hyperbolicity is a nontrivial and
general property that is distinct from their degree distribution
and, based on the $\sim N^2$ scaling of the previous
paragraph, can significantly impact performance. Figure 5
summarizes the relationship between several key
characteristics discussed in the literature in the context
of large-scale complex networks. We observe that hyperbolicity
entails small world behavior, a fundamental
property of networks.
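
The tree case set aside in the first point is the one setting where $N^2$ scaling of the center load can be checked by hand, which makes it a useful baseline. The sketch below is an illustration added here, not a computation from the paper: it uses a regular tree rather than the hyperbolic grids, the function name `cayley_tree_center_load` is hypothetical, and pairs that have the center as an endpoint are assumed not to load it.

```python
def cayley_tree_center_load(b, n):
    """Return (N, load at the center) for a tree whose center has b branches
    and whose other interior nodes have b - 1 children, truncated at n hops.
    """
    per_branch = sum((b - 1) ** h for h in range(n))  # nodes in one branch
    total = 1 + b * per_branch                        # N, including the center
    # Paths in a tree are unique, so a pair loads the center exactly when
    # its two endpoints lie in different branches.
    load = (b * (b - 1) // 2) * per_branch ** 2
    return total, load


# Ratio of center load to N^2 for trees of increasing depth (b = 3).
for n in range(2, 12, 3):
    N, load = cayley_tree_center_load(3, n)
    print(n, N, load, load / N ** 2)
```

With three branches the ratio of center load to $N^2$ approaches 1/3 as $n$ grows, so the center load grows as $N^2$ while $N$ itself grows exponentially with $n$.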

Our results suggest that, counterbalancing the
benefits of hyperbolicity such as the small world
property, core congestion is a structural problem due to such
hyperbolicity that grows more acute as the network grows
in size. As long as routing protocols use geodesics in one
form or another, whether in intra-domain, inter-domain
or other forms of routing, congestion is a natural consequence
of this intrinsic structural feature of networks.
Using '$(1 + \epsilon)$ routing', in which traffic between nodes
is not routed along the geodesic(s) between them but is
deliberately sent on slightly longer paths, would in fact
alleviate core congestion. This is a phenomenon familiar
from vehicular traffic: the shortest routes using expressways
can become so overcrowded that indirect and longer
paths through backroads become faster.

This work was funded by AFOSR grant FA9550-08-1- 